Adaptive hash retrieval with kernel based similarity

Authors

  • Xiao Bai
  • Cheng Yan
  • Haichuan Yang
  • Lu Bai
  • Jun Zhou
  • Edwin R. Hancock
Abstract

Indexing methods have been widely used for fast data retrieval on large-scale datasets. When the data are represented by high-dimensional vectors, hashing is often used as an efficient solution for approximate similarity search. When a retrieval task does not involve supervised training data, most hashing methods aim to preserve the data similarity defined by a distance metric on the feature vectors. Hash codes generated by these approaches normally keep the Hamming distance between data points in accordance with the similarity function, but ignore the local details of the data distribution. This objective is not suitable for k-nearest neighbor search, since the similarity to the nearest neighbors can vary significantly between data samples. In this paper, we present a novel adaptive similarity measure which is consistent with k-nearest neighbor search, and prove that it leads to a valid kernel if the original similarity function is a kernel function. Next, we propose a method which calculates hash codes using the kernel function. With a low-rank approximation, our hashing framework is more effective than existing methods that preserve similarity over an arbitrary kernel. The proposed similarity function, the hashing framework, and their combination demonstrate significant improvement when compared with several alternative state-of-the-art methods.
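The observation that similarity to the nearest neighbors varies from sample to sample can be illustrated with a small sketch. The Gaussian kernel, the bandwidth `sigma`, the toy 1-D points, and the helper names `rbf_kernel`, `knn_similarities`, and `neighbors_above` are all assumptions made for illustration. This is not the adaptive measure proposed in the paper, only a demonstration of the problem it addresses.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel on scalars; sigma is an illustrative bandwidth."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def knn_similarities(points, index, k, sigma=1.0):
    """Kernel similarities from points[index] to its k nearest neighbors.

    The RBF kernel decreases with distance, so the k largest similarities
    belong to the k nearest neighbors.
    """
    sims = [rbf_kernel(points[index], p, sigma)
            for j, p in enumerate(points) if j != index]
    return sorted(sims, reverse=True)[:k]


def neighbors_above(points, index, threshold, sigma=1.0):
    """Count how many other points reach a fixed, global similarity threshold."""
    return sum(1 for j, p in enumerate(points)
               if j != index and rbf_kernel(points[index], p, sigma) >= threshold)


# Toy data (hypothetical): a dense cluster near 0 and a sparse group near 10.
points = [0.0, 0.1, 0.2, 0.3, 10.0, 12.0, 14.0, 16.0]

dense_sims = knn_similarities(points, index=0, k=2)   # sample in the dense region
sparse_sims = knn_similarities(points, index=4, k=2)  # sample in the sparse region

print("dense  2-NN similarities:", dense_sims)
print("sparse 2-NN similarities:", sparse_sims)
print("neighbors with similarity >= 0.5 (dense): ", neighbors_above(points, 0, 0.5))
print("neighbors with similarity >= 0.5 (sparse):", neighbors_above(points, 4, 0.5))
```

In this toy setting the two nearest neighbors of the dense sample have similarities close to 1, while those of the sparse sample are far smaller, so any single global similarity threshold returns a different number of neighbors for each sample. A similarity that adapts to the local neighborhood, as the abstract proposes, is meant to remove that mismatch with k-nearest neighbor search.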


Similar resources

Diffusion Hashing

With the worldwide spread of the broadband Internet, massive multimedia data including texts, images, and videos are increasing explosively and are available for interactive applications over the Internet. At the same time, increasing attention has been paid to fast retrieval from massive multimedia databases. Hash-based Approximate Nearest Neighbor (ANN) search is a technology that ac...


Set-to-Set Hashing with Applications in Visual Recognition

Visual data, such as an image or a sequence of video frames, is often naturally represented as a point set. In this paper, we consider the fundamental problem of finding a nearest set from a collection of sets, to a query set. This problem has obvious applications in large-scale visual retrieval and recognition, and also in applied fields beyond computer vision. One challenge stands out in solv...


Web Image Re-Ranking Using Hash Based Signatures

This project addresses content-based image retrieval in general, and in particular, focuses on developing a hidden class detection methodology to address effective semantics-intensive image retrieval. In our approach, each image in the database is segmented into classes and contains classified images. We explore the query adaptive ranking to retrieve images. With this representation, model base...


Announcing the Final Examination of Kai Li for the degree of Doctor of Philosophy Time & Location: June 6, 2017 at 10:00 AM in HEC 450 Title: Hashing for Multimedia Similarity Modeling and Large-scale Retrieval

In recent years, the amount of multimedia data such as images, texts, and videos has been growing rapidly on the Internet. Motivated by such trends, this thesis is dedicated to exploiting hashing-based solutions to reveal multimedia data correlations and support intra-media and inter-media similarity search among huge volumes of multimedia data. We start by investigating a hashing-based soluti...




Journal title:
  • Pattern Recognition

Volume 75


Publication date 2018